Standardizing lexical-semantic resources - Fleshing out the abstract standard LMF
Abstract
This paper describes the application of the Lexical Markup Framework (LMF) to the standardization of lexical-semantic resources for NLP. More specifically, we address the question of how lexical-semantic resources can be made semantically interoperable by means of LMF and ISOCat. UBY-LMF, an instantiation of LMF designed specifically for NLP, serves as an example to illustrate the path towards semantic interoperability of lexical resources.
Related resources
UBY-LMF - A Uniform Model for Standardizing Heterogeneous Lexical-Semantic Resources in ISO-LMF
We present UBY-LMF, an LMF-based model for large-scale, heterogeneous multilingual lexical-semantic resources (LSRs). UBY-LMF allows the standardization of LSRs down to a fine-grained level of lexical information by employing a large number of Data Categories from ISOCat. We evaluate UBY-LMF by converting nine LSRs in two languages to the corresponding format: the English WordNet, Wiktionary, W...
Standardizing Wordnets in the ISO Standard LMF: Wordnet-LMF for GermaNet
It has been recognized for quite some time that sustainable data formats play an important role in the development and curation of linguistic resources. The purpose of this paper is to show how GermaNet, the German version of the Princeton WordNet, can be converted to the Lexical Markup Framework (LMF), a published ISO standard (ISO-24613) for encoding lexical resources. The conversion builds o...
UBY - A Large-Scale Unified Lexical-Semantic Resource Based on LMF
We present UBY, a large-scale lexical-semantic resource combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It currently contains nine resources in two languages: English WordNet, Wiktionary, Wikipedia, FrameNet and VerbNet, German Wikipedia, Wiktionary and GermaNet, and multilingual OmegaWiki modeled according to the LM...
Subcat-LMF: Fleshing out a standardized format for subcategorization frame interoperability
This paper describes Subcat-LMF, an ISO-LMF-compliant lexicon representation format featuring a uniform representation of subcategorization frames (SCFs) for the two languages English and German. Subcat-LMF is able to represent SCFs at a very fine-grained level. We utilized Subcat-LMF to standardize lexicons with large-scale SCF information: the English VerbNet and two German lexicons, i.e., a sub...
Using Standardized Lexical Semantic Knowledge to Measure Similarity
Sentence semantic similarity is essential to many applications of Natural Language Processing. It has been treated in several frameworks dealing with the similarity between short texts, especially between sentence pairs. However, the semantic component was paradoxically weak in the proposed methods. In order to address this weakness, we propose in t...